PRESENTATION OF SEARCH RESULTS
USING DYNAMIC CATEGORIZATION 

FIELD OF THE INVENTION 

The present invention relates to information retrieval, and 
more specifically, to an approach for presenting search 
results using dynamic categorization.

BACKGROUND OF THE INVENTION 

Information systems provide for the storage, retrieval and 
sometimes management of data. Information is typically 
retrieved from an information system by submitting a query 
to the information system, where the query specifies a set of 
retrieval criteria. The information system processes the 
query against a database and provides data that satisfies the 
search criteria (search results) to a user. 

The form of search results depends upon the context in 
which a particular search is performed. For example, in the 
context of a database search, search results might consist of 
a set of rows from a table. In the context of the global 
information network known as the "Internet", the search 
results might consist of links to web pages. 

For the purpose of explanation, the specific data items 
against which a search query is executed are referred to 
herein as searchable data items. The set of all searchable data 
items against which a query is executed is referred to herein 
as the searchable data set. The specific searchable data items 
that satisfy a particular query are referred to herein as 
matching data items. The set of all matching data items for
a given query is referred to herein as the search results of
the query.

Processing a query containing general or generic search 
terms against a large searchable data set can result in a large 
number of unorganized matching data items, sometimes 
referred to as "hits." For example, processing a query 
containing general or generic terms on the Internet can 
generate millions of hits. 

On the Internet, search queries are processed by search 
tools known as "search engines" that typically present a 
sequential list of matching data items ranked by relevance, 
from most relevant to least relevant. As a result, the match- 
ing data items that best satisfy the search criteria are 
presented at the top of the list, with the other matching data 
items presented further down the list in order of decreasing 
relevance. For example, web pages or web sites with web 
pages that contain the greatest number of the search terms 
receive the highest relevance ranking and are presented at 
the top of the list. 

Because the search results are presented serially, with 
approximately ten to twenty hits per page, reviewing a large 
number of hits, for example several thousand, or even only 
several hundred hits, is often impractical. This is not
necessarily a problem in situations where the relevancy
ranking drops off quickly after a relatively small number of
hits because a user will typically only view the most relevant
matching data items. However, in situations where a large 
number of hits have a high relevancy ranking, it can be 
impractical to review all of the most relevant hits. 

One alternative approach for presenting search results is 
the static category approach. The static category approach 
involves pre-assigning all searchable data items to pre- 
defined or "static" subject matter categories based upon their 
content. When a search is performed, a relatively small
number of categories that satisfy the search criteria are
displayed instead of, or in addition to, the actual matching
data items. The members of those static categories (which
may or may not satisfy the search criteria) can then be 
accessed through the categories. 

In the context of the Internet, for example, all web pages
and web sites containing subject matter relating to the topic
of baseball would be statically assigned to a baseball
category. When a query containing the term "baseball" is
processed, the baseball category is displayed, instead of or
in addition to, all of the individual web pages that satisfy the
query terms. A user can then select the baseball category to
view the web pages and web sites assigned to the baseball
category. Categories containing a large number of searchable
data items can be divided into sub-categories to create
a statically-defined category hierarchy.

Although the static category approach is helpful in allowing
a user to navigate through a large number of searchable
data items in an organized manner, it suffers from several
drawbacks. First, if the amount of information being
searched is large, a large amount of resources can be
required to pre-assign all of the searchable data items to
categories. Furthermore, when the searchable data set
changes, the category assignments must be updated to reflect
the changes. For example, if new searchable data items are
added to the searchable data set and the categories are not
updated to reflect the new searchable data items, then a user
cannot access the new searchable data items through the
categories. As a result, the new searchable data items that
cannot be accessed through the categories are effectively
lost.

Another drawback to the static category approach is that
the statically-defined categories may not be helpful in finding
information that does not fit squarely into the predefined
categories. Thus, a search may result in the display of ten
categories, where each of the ten categories has a relatively
low degree of relevance.

These problems are particularly acute on the Internet for
at least two reasons. First, the Internet provides access to a
vast amount of information, so an enormous amount of
resources is required to assign searchable data items to
categories. Second, the information available through the
Internet is constantly changing and new information is being
added at an astounding rate. Consequently, a large amount of
resources is required to maintain static categories that do not
necessarily reflect all of the searchable data set. Therefore,
based upon the need to present a large number of matching
data items in an organized manner and the limitations of
prior approaches, an approach for presenting a large number
of matching data items in an organized manner that does not
suffer from the limitations of prior approaches is highly
desirable.

SUMMARY OF THE INVENTION 

According to one aspect of the invention, a method is
provided for presenting search results using dynamic
categorization. The method comprises the steps of receiving
search results, dynamically establishing one or more search
result categories based upon attributes of the search results
and presenting one or more category identifiers corresponding
to the one or more search result categories.

According to another aspect of the invention, a method is
provided for presenting search results on a user interface
using dynamic categorization. The method comprises the
steps of dynamically establishing one or more search result
categories based upon attributes of the search results and
displaying on the user interface one or more interface
objects corresponding to the one or more search result
categories.

According to another aspect of the invention, a computer
system is provided for presenting search results to a user 
using dynamic categorization. The computer system com- 
prises a user interface, one or more processors and a memory 
coupled to the one or more processors. The memory contains 
one or more sequences of one or more instructions which, 
when executed by the one or more processors, cause the 
computer system to perform the steps of receiving search 
results, dynamically establishing one or more search result 
categories based upon attributes of the search results and 
displaying on the user interface one or more category 
indicators corresponding to the one or more search result 
categories. 

BRIEF DESCRIPTION OF THE DRAWINGS 

Embodiments of the invention are illustrated by way of 
example, and not by way of limitation, in the figures of the 
accompanying drawings and in which like reference numer- 
als refer to similar elements and in which: 

FIG. 1 is a high-level flow chart illustrating an approach 
for presenting search results using dynamic categorization 
according to an embodiment of the invention; 

FIG. 2 is a detailed flow chart illustrating an approach for 
presenting search results using dynamic categorization 
according to another embodiment of the invention; 

FIG. 3A is a block diagram illustrating a user interface for 
presenting search results using dynamic categorization 
according to an embodiment of the invention; 

FIG. 3B is a block diagram illustrating a user interface for 
presenting search results using dynamic categorization and 
sub-categories according to an embodiment of the invention;

FIG. 3C is a block diagram illustrating a user interface for 
presenting search results using dynamic categorization and 
user-selectable categories according to an embodiment of 
the invention; and 

FIG. 4 is a block diagram of a computer system on which 
embodiments of the invention may be implemented. 

DETAILED DESCRIPTION OF THE 
INVENTION 

In the following description, for the purposes of 
explanation, specific details are set forth in order to provide 
a thorough understanding of the invention. However, it will 
be apparent that the invention may be practiced without 
these specific details. In other instances, well-known struc- 
tures and devices are depicted in block diagram form in 
order to avoid unnecessarily obscuring the invention.

FUNCTIONAL OVERVIEW 

In general, search results are presented using dynamic 
categorization. Dynamic categorization involves examining 
search results and dynamically establishing one or more 
search result categories based upon attributes of the search 
results. As described in more detail hereinafter, a variety of
grouping or clustering techniques may be used to dynamically
establish the search result categories. The search result
categories are then presented using category indicators, as 
described in more detail hereinafter. 

Dynamic categorization allows search result categories to 
be generated on a search-by-search basis while ensuring that 
all matching data items are assigned to at least one search 
result category. As a result, categories may be tailored to 
each set of search results and based on user or application 
preferences. Dynamic categorization may be used in com- 
bination with static categories to provide a hybrid category 
hierarchy. Finally, dynamic categorization may be used to
modify search queries, as described in more detail herein- 
after. 

FIG. 1 is a flow chart 100 illustrating an approach for
presenting search results using dynamic categorization
according to an embodiment of the invention. After starting
in step 102, in step 104 search results are received. In step
106, the search results are examined and one or more search
result categories are dynamically established based upon
attributes of the matching data items that satisfy the query.
In step 108, the search results are presented to a user based
upon the one or more search result categories, as described
in more detail hereinafter. Finally, the process is complete in
step 110.

1. DYNAMICALLY DETERMINING CATEGORIES

Dynamically determining categories involves identifying 
similarities and/or dissimilarities of attributes in the match- 
ing data items and establishing a set of candidate categories 
based upon the identified similarities and/or dissimilarities. 

The nature of the attributes used to determine similarities
and/or dissimilarities may differ based on the nature of the
matching data items. For example, if the matching data
items are structured records, the attributes used to determine
the categories may be selected fields of the structured
records. On the other hand, if the matching data items are
relatively unstructured text-based electronic documents,
then the attribute values used to determine categories may
simply be similarity coefficients that have been generated
based on comparisons between the text contents of the
documents.
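
As one illustration of a similarity coefficient computed from text
contents, the following minimal Python sketch calculates a Jaccard
coefficient between the word sets of two documents. The function names
tokenize and jaccard, the whitespace tokenization, and the sample
strings are assumptions made for illustration only; they are not taken
from any embodiment described herein.

```python
def tokenize(text):
    """Lower-case the text and split on whitespace (an assumed, simplistic tokenizer)."""
    return set(text.lower().split())


def jaccard(a, b):
    """Return the size of the shared word set divided by the size of the combined word set."""
    words_a, words_b = tokenize(a), tokenize(b)
    union = words_a | words_b
    if not union:
        return 0.0  # two empty documents: treated here as having no similarity
    return len(words_a & words_b) / len(union)


# Hypothetical document texts.
doc1 = "compact car with low price"
doc2 = "compact car with high price"
doc3 = "baseball season schedule"

print(jaccard(doc1, doc2))  # four shared words out of six distinct words
print(jaccard(doc1, doc3))  # no shared words
```

Documents whose pairwise coefficient exceeds some application-chosen
threshold could then be treated as candidates for the same category.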

The candidate categories may be filtered or otherwise
processed to select an appropriate number of final categories
from the candidate categories. In situations where the number
of candidate categories is sufficiently small, the filtering
may not be necessary. Ideally, the number of final categories
is selected so that when the final categories are presented to
a user, the user can review the final categories in a relatively
short period of time. Accordingly, the actual number of final
categories necessarily depends upon both the requirements
of a particular application and the way in which the final
categories are presented to the user.

Once the final categories are determined, the matching
data items are assigned to the final categories and the final
categories are presented to the user. The steps of determining
candidate categories, determining final categories based
upon the candidate categories and assigning the matching
data items to the final categories are collectively referred to
as "clustering." The particular clustering technique used
depends upon the particular requirements of an application
and the invention is not limited to any particular clustering
technique. Examples of clustering techniques include Bayesian
clustering, neural networks, Jaccard similarity
coefficients, semantic analysis and various natural language
processing algorithms. The particular clustering algorithm
used may be user-defined.
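
The three clustering steps can be sketched in Python as follows. This
is a deliberately simple stand-in that groups by a single attribute
value rather than any of the techniques listed above. The function name
cluster, the dictionary record format, the rule of keeping the largest
groups, and the "Other" catch-all category are all assumptions made for
illustration.

```python
from collections import defaultdict


def cluster(items, attribute, max_categories, fallback="Other"):
    """Group items by one attribute and keep only the largest groups.

    items          -- list of dicts (hypothetical record format)
    attribute      -- key whose value names the candidate category
    max_categories -- number of final categories to keep (assumed policy)
    fallback       -- broader category that absorbs the rest (assumption)
    """
    # Step 1: candidate categories, one per distinct attribute value.
    candidates = defaultdict(list)
    for item in items:
        candidates[item[attribute]].append(item)

    # Step 2: final categories are the candidates with the most members.
    ranked = sorted(candidates, key=lambda name: len(candidates[name]), reverse=True)
    final = {name: list(candidates[name]) for name in ranked[:max_categories]}

    # Step 3: every remaining item is assigned to the fallback category,
    # so that no matching data item is left without a category.
    for name in ranked[max_categories:]:
        final.setdefault(fallback, []).extend(candidates[name])
    return final


# Hypothetical search results.
hits = [
    {"name": "Tango", "size": "compact"},
    {"name": "Foxtrot", "size": "compact"},
    {"name": "Zebra", "size": "full size"},
    {"name": "Elephant", "size": "full size"},
    {"name": "Rhino", "size": "full size"},
    {"name": "Spark", "size": "sports"},
]
result = cluster(hits, "size", max_categories=2)
for category, members in result.items():
    print(category, [m["name"] for m in members])
```

Choosing max_categories corresponds to selecting a number of final
categories that a user can review quickly; a real application would
likely replace the single-attribute grouping with one of the clustering
techniques named above.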

The approach of presenting search results using dynamic
categorization is now described with reference to the flow
chart 200 of FIG. 2. After starting in step 202, in step 204
search results are received. The particular way in which a
search is performed is not germane to embodiments of the
invention and embodiments of the invention are not limited
to any particular type of search.

In step 206, a determination is made as to whether initial
criteria are satisfied. According to one embodiment of the
invention, the initial criteria include a minimum number of
search results. If the number of matching data items is
below a minimum threshold, then dynamic categorization is
not used and traditional presentation approaches are used
instead. Another example of the initial criteria is whether the
search results consist of data from more than one data source
(e.g. different databases, such as a real time query and a
static database query), where dynamic categorization is used
to combine the data from the different sources to be
presented to the user. If the initial criteria are not satisfied, then
the process is complete in step 224.

If, however, in step 206, a determination is made that the
initial criteria are satisfied, then in step 208 the matching
data items (search results) are filtered to generate filtered
search results. According to one embodiment of the
invention, the matching data items are filtered by a relevance
threshold. Traditional search techniques provide a relevancy
rating for search results that indicates how well individual
matching data items satisfy the search criteria. In situations
where a query results in a large number of matching data
items, it is often useful to reduce the number of matching
data items by discarding matching data items that do not
satisfy a minimum relevance threshold.

For example, for particular search results containing a
large amount of data, all matching data items having a
relevancy of less than fifty percent might be discarded.
According to another embodiment of the invention, a
particular number of the most relevant hits are retained, with the
remaining hits being discarded. For example, suppose a
determination is made that at most one hundred hits are
desired. A particular search is performed and the search
results include twenty thousand hits. In this situation the
relevancy ratings for the matching data items are used to
identify and keep the one hundred most relevant hits and
discard the remaining nineteen thousand, nine hundred hits.

For the purpose of explanation, the matching data items
that are not discarded during the filtering process are
referred to herein as qualifying data items. Thus, in the
example given above, the query resulted in twenty thousand
matching data items, but only one hundred qualifying data
items.

In step 210, the qualifying data items are optionally sorted
by one or more attributes to generate sorted search results.
For example, in the context of search results that include
addresses, the search results might be sorted by zip code.

In step 212, common attribute values among the qualifying
data items are identified. The common attribute values
are specific to each set of search results. For example, for
search results pertaining to automobiles, common attribute
values may include compact cars, mid-size cars, full size
cars, and sports cars.

In step 214, similarity data is determined for the search
results that indicates the occurrence of the common attribute
values among the qualifying data items. For example, the
similarity data would indicate how many of the hits in the
filtered search results have the attribute values of compact
cars, mid-size cars, full size cars, and sports cars,
respectively. In step 216, the search results are grouped based upon
the similarity data. For example, the qualifying data items
having the compact car attribute value are grouped together
and the hits in the search results having the mid-size car
attribute value are grouped together.

In step 218, one or more categories are selected based
upon the groupings. According to one embodiment of the
invention, the one or more categories are selected by a
majority vote. Specifically, the categories having the most
qualifying data items are selected. Categories having
relatively few qualifying data items are collapsed
into broader categories, so as to reduce the total number of
selected categories.

In step 220, the qualifying data items are assigned to the
categories. For example, the hits having the compact car
attribute are assigned to the compact car category. For hits
having attributes of categories that were collapsed into
broader categories, those hits are assigned to the broader
category. For example, if the mid-size car and full size car
categories are collapsed into a single full size car category,
then all of the hits having the mid-size car attribute are
included in the full size car category. In step 222, the
categories and qualifying data items are presented to the
user, as described in more detail hereinafter. The process is
complete in step 224.

In steps 214 and 216, more than one algorithm may be
used to produce a number of groupings. According to one
embodiment of the invention, an optimal grouping may be
selected as the grouping presented to the user. An optimal
grouping is typically determined based upon the
requirements of a particular application. For example, grouping by
one attribute may produce more categories than grouping by
another attribute. Conversely, some groupings may cluster
results with similar relevance scores (which may be
independent of the categorization criteria). This may be
preferable in some circumstances to groupings with a
smaller number of categories.

An application can also have access to the different
groupings formed during steps 214 and 216, so that the
application or the user may elect to view a grouping
other than the one initially selected for presentation. This
ability to take different views of what is basically the same
large collection of data is akin to doctors using X-ray, MRI,
and CAT scans to look at the same tumor in different ways in
order to understand it better.

2. PRESENTING SEARCH RESULTS

FIG. 3A illustrates a user interface 300 for presenting
search results using dynamic categorization according to an
embodiment of the invention. User interface 300 may be
implemented in any combination of discrete hardware
circuitry and computer software. Typically, user interface 300
is provided as a graphical representation on a computer
screen that is generated by the execution of sequences of
instructions by one or more processors.

Categories that are dynamically determined in accordance
with embodiments of the invention are presented using
category indicators. A category indicator is any object that is
capable of representing a category. Since the invention is not
limited to any particular medium for presenting search
results, the type of category indicator may vary depending
upon the requirements of a particular application. For
example, for presenting search results on a user interface, a
user interface object may be used as a category indicator.
The user interface object may provide some indicia that it
corresponds to a particular category of search results,
dynamically determined in accordance with embodiments of
the invention. For presenting search results in a data file or
on a printer, a category indicator may include a text string
identifying the corresponding category.

Referring to the prior example of search results pertaining
to automobiles, user interface 300 includes three category
indicators 302, 304 and 306 that correspond to the
dynamically-determined categories previously described.
Category indicator 302 corresponds to the category
"Automobiles: Compact Cars" and includes two qualifying data
items from the search results, designated by the reference
numeral 308. Qualifying data items 308 include compact
cars "Tango" and "Foxtrot". Category indicator 304
corresponds to the category "Automobiles: Full Size Cars" that
includes qualifying data items 310. Qualifying data items
310 include full size cars "Zebra," "Elephant" and "Rhino."
Category indicator 306 corresponds to the category
"Automobiles: Sports Cars" that includes a qualifying data item
312. Qualifying data item 312 is a sports car "Spark."

For purposes of illustration, in FIG. 3A the qualifying data
items 308, 310, 312 and 314 are displayed with their
respective category indicators 302, 304 or 306. However,
according to another embodiment of the invention, qualifying
data items 308, 310, 312 and 314 are not initially
displayed. Rather, only category indicators 302, 304 and 306
are initially displayed to reduce the amount of information
shown. The qualifying data items
308, 310, 312 and 314 are displayed in response to a user
selection of category indicators 302, 304 and 306. For
example, in response to a user selection of category indicator
302, qualifying data items 308 are displayed. In response to
another user selection (de-selection) of category indicator
302, qualifying data items 308 are removed from user
interface 300. This is particularly helpful when category
indicator 302 contains a sufficiently large number of
qualifying data items 308 such that other category indicators 304
and 306 cannot be displayed simultaneously with the
members of the category associated with category indicator 302.

User interface 300 also includes an indicator 314
identified as "<more in this category>." In response to the
selection of indicator 314 by a user, additional hits in the
category corresponding to category indicator 304 are
displayed on user interface 300. Indicator 314 provides the
benefit of informing a user that additional hits for the
category corresponding to category indicator 304 are
available, without over-cluttering user interface 300.

For example, if qualifying data items 308, 310 and 312 are
structured records, the text titles may be derived from fields
in the structured records. In the present example, both of the
qualifying data items 308, namely "Tango" and "Foxtrot"
may have a "compact car" field. In circumstances where
qualifying data items 308, 310 and 312 are relatively
unstructured text-based electronic documents, then category
indicators 302, 304 and 306 may not be displayed at all.
Instead, the first qualifying data item in qualifying data items
308, 310 and 312, namely "Tango," "Zebra," and "Spark"
would be displayed on user interface 300 followed by a
user-selectable "<more like this>" indicator. This approach
displays a representative qualifying data item in qualifying
data items 308, 310 and 312 while allowing a user to easily
view the remaining qualifying data items by selecting the
"<more like this>" indicator. The text titles provided with
category indicators 302, 304 or 306 are derived from
attributes of their respective qualifying data items 308, 310
and 312.

Categories within a group may be presented to users in
any order. However, some orderings may be preferable to
others. For example, a grouping by unit price range may be
more suitably displayed initially sorted by price range. A
common way of presenting groups during "fuzzy" searches
(where matches aren't exact) is by relevance. A category
relevance rating can be calculated for each category, and the
categories can then be presented in relevance sorted order.

Category relevance can be calculated in any number of
ways depending on the requirements of a particular
application. One way is to assign the highest relevance score of
any item in the category as the category's score. This has the
effect of elevating groups containing at least one high
scoring item to the top. Another way is to assign the average
score of all items in the category as the category's score. Yet
another way is to use the median, or a weighted average. In
the case where there isn't a clear ordering even after
assigning the scores to the categories (e.g. scores are very
similar), another ordering (such as alphabetical) may be used
as a tie breaker. Again, the user and the application may have
complete control over which algorithm is used, and can select
different algorithms.

3. SUB-CATEGORIES

Dynamic categorization may also be used to generate
sub-categories. Generating sub-categories is particularly
useful when a category has a large number of hits. For
example, referring to FIG. 3B, in the situation where the
category corresponding to category indicator 304 contains a
large number of hits, sub-categories are generated and
sub-category indicators 316 and 318 corresponding to the
sub-categories are presented on user interface 300. The
sub-categories corresponding to sub-category indicators 316
and 318 are generated based upon attributes of qualifying
data items 310 contained in the category corresponding to
category indicator 304.

In the present example, qualifying data items 310 have a
price attribute which is used to generate the sub-categories
that correspond to sub-category indicators 316 and 318.
Specifically, the sub-category corresponding to sub-category
indicator 316 is generated for hits having a price attribute of
less than $25,000. In the present example, this sub-category
includes entries 320 "Zebra" and "Elephant." On the other
hand, the sub-category corresponding to sub-category
indicator 318 is generated for hits having a price attribute of
more than $25,000. This sub-category includes a hit 322
"Rhino." The sub-category corresponding to sub-category
indicator 318 also includes a hit 324 designated as "<more
in this category>" that provides access to additional hits in
sub-category 318.

According to one embodiment of the invention,
sub-category indicators 316 and 318 and hits 318, 320 and 322
are not initially displayed under category indicator 304. In
response to a user selection of category indicator 304,
sub-category indicators 316 and 318 are displayed, but not
hits 318, 320 and 322. Then, in response to a user selection
of sub-category indicators 316 and 318, hits 318, 320 and 322
are displayed, respectively. This is particularly helpful when
the category corresponding to category indicator 304
contains a large number of hits. Sub-category indicators 316 and
318 may also be de-selected and removed from display as previously
described with respect to category indicators 302, 304 and
306.

4. USER-SELECTABLE CATEGORIES

According to another embodiment of the invention, a set
of one or more candidate categories is presented to a user
and the user is permitted to select one or more of the
candidate categories, and/or one or more sets of candidate
categories, to be used as the final categories to present the
search results. Once the user selects the final categories, the
qualifying data items are assigned to the final categories and
the final categories and search results are presented to the
user.

As illustrated in FIG. 3C, user interface 300 includes a set
of user-selectable category indicators 330 corresponding to
categories that have been determined using the dynamic
categorization approach described herein. A user may select
one or more of the user-selectable category indicators 330 to
be used in presenting the search results to the user. This
provides a user with the flexibility to choose specific
categories to be used to categorize the search results. User
interface 300 also includes a set of hit counts 332 that
indicate how many hits are assigned to each of the
user-selectable categories 330. The hit counts 332 provide
information that may help the user determine which of the
user-selectable categories he or she might want to choose.

According to one embodiment, the user may select one or
more sets of categories, where the categories within one set 
are established based on different attributes than the catego- 
ries within the other sets. For example, one set of categories 
may group cars according to their size, while another set of 
categories groups cars according to their price range, while 
yet another set of categories groups cars according to their 
manufacturer. The user may then select specific categories 
from one or more of the category sets on a category by 
category basis, or on an entire category-set by category-set 
basis. 

Significantly, when some final categories are generated 
based on different attributes than other final categories, then 
it is possible for the same qualifying data item to be assigned
to more than one of the final categories. For example, if a
user selects a particular car size category as a final category, 
a particular price range category as a final category, and a 
particular manufacturer category as a final category, it is 
possible for a qualifying data item that contains information 
about a particular car to fall into all three of the selected 
categories. 
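
The following Python sketch illustrates how category sets built from
different attributes can place one qualifying data item in several
selected categories, and how per-category hit counts could be derived.
The names CATEGORY_SETS, build_category_sets and hit_counts, the record
fields, and the sample prices and manufacturers are hypothetical values
chosen for illustration.

```python
from collections import defaultdict


def price_range(car):
    """Bucket a price into one of two ranges (the boundary is an assumed example value)."""
    return "under $25,000" if car["price"] < 25000 else "$25,000 and over"


# Each category set groups the same items by a different attribute.
CATEGORY_SETS = {
    "size": lambda car: car["size"],
    "price range": price_range,
    "manufacturer": lambda car: car["maker"],
}


def build_category_sets(items):
    """Return {set name: {category name: [item names]}} for every category set."""
    sets = {}
    for set_name, key in CATEGORY_SETS.items():
        groups = defaultdict(list)
        for item in items:
            groups[key(item)].append(item["name"])
        sets[set_name] = dict(groups)
    return sets


def hit_counts(category_sets):
    """Count the hits in each (set name, category name) pair."""
    return {
        (set_name, category): len(members)
        for set_name, categories in category_sets.items()
        for category, members in categories.items()
    }


# Hypothetical qualifying data items.
cars = [
    {"name": "Tango", "size": "compact", "price": 18000, "maker": "Acme"},
    {"name": "Zebra", "size": "full size", "price": 24000, "maker": "Acme"},
    {"name": "Rhino", "size": "full size", "price": 31000, "maker": "Globex"},
]
sets = build_category_sets(cars)

# Final categories picked by the user, one from each category set.
selected = [("size", "full size"), ("price range", "under $25,000"), ("manufacturer", "Acme")]
for set_name, category in selected:
    print(set_name, category, sets[set_name][category])
```

In this sample data the item "Zebra" appears under all three selected
categories, which is the overlap described above.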

5. USING DYNAMIC CATEGORIZATION WITH 
STATIC CATEGORIES 

Dynamic categorization may also be used with static 
categories. Using dynamic categorization with static catego- 
ries is particularly helpful when a static category includes a 
large number of hits. Under these circumstances, dynamic 
categorization may be used to determine one or more 
sub-categories to organize the hits contained in the particular 
category. Dynamic categorization is also particularly helpful 
when certain hits are not assigned to any static categories. 
These hits are often referred to as "orphan hits." Additional 
categories may be generated for the orphan hits using the 
dynamic categorization approach described herein. 

For example, referring to FIG. 3B, suppose that category 
indicator 304 is a static category that contains a large 
number of hits. Under these circumstances, dynamic cat- 
egorization is useful to dynamically determine sub- 
categories, as previously described, to provide additional 
organization to the hits contained in the static category 
corresponding to static category indicator 304. If the sub- 
categories contain too many hits, then additional sub- 
categories may be generated. The additional sub-categories 
may be added to the static category associated with category
indicator 304 or to the sub-categories associated with sub- 
category indicators 316 and 318. 
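
As a rough sketch of this section, assuming a simple attribute grouping as a stand-in for whatever clustering technique is actually used, the code below splits an over-full static category into dynamically determined sub-categories and builds additional categories for orphan hits. The threshold `MAX_HITS`, the rule format and the sample hits are hypothetical.

```python
from collections import defaultdict

MAX_HITS = 3  # assumed threshold for "a large number of hits"


def dynamic_categories(hits, attribute):
    """Stand-in for dynamic categorization: group hits by one attribute."""
    groups = defaultdict(list)
    for hit in hits:
        groups[hit.get(attribute, "other")].append(hit["title"])
    return dict(groups)


def organize(hits, static_rules, attribute):
    """Assign hits to static categories, then refine dynamically.

    static_rules maps a static category name to a predicate over a hit.
    """
    static = {name: [] for name in static_rules}
    orphans = []
    for hit in hits:
        matched = [name for name, rule in static_rules.items() if rule(hit)]
        if matched:
            for name in matched:
                static[name].append(hit)
        else:
            orphans.append(hit)  # not assigned to any static category

    result = {}
    for name, members in static.items():
        if len(members) > MAX_HITS:
            # Too many hits: determine sub-categories dynamically.
            result[name] = dynamic_categories(members, attribute)
        else:
            result[name] = [hit["title"] for hit in members]
    if orphans:
        result["(dynamic, from orphan hits)"] = dynamic_categories(orphans, attribute)
    return result


HITS = [
    {"title": "h1", "topic": "cars", "kind": "sports"},
    {"title": "h2", "topic": "cars", "kind": "compact"},
    {"title": "h3", "topic": "cars", "kind": "sports"},
    {"title": "h4", "topic": "cars", "kind": "compact"},
    {"title": "h5", "topic": "boats", "kind": "sail"},
    {"title": "h6", "topic": "planes", "kind": "jet"},
]
RULES = {
    "Cars": lambda hit: hit["topic"] == "cars",
    "Boats": lambda hit: hit["topic"] == "boats",
}
organized = organize(HITS, RULES, "kind")
print(organized)
```

The same refinement could be applied again to any sub-category that still holds too many hits.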

6. MODIFYING SEARCH CRITERIA USING DYNAMIC 
CATEGORIZATION 

Dynamic categorization may also be used to modify 
search criteria to be used in subsequent searches. A search 
query may be modified (broadened or narrowed) based upon 
dynamic categories determined by dynamic categorization. 
Specifically, query terms that correspond to dynamic cat- 
egories may be added to a search query, replace existing 
query terms or be used instead of existing query terms. For 
example, suppose in the prior example the original query 
was "automobile". The original query may be modified to 
add the term "sports cars" to form a new query "automobile 
AND sports cars" when the user selects the category iden- 
tifier for the dynamically determined "sports car" category. 
As another example, the original query may be modified to 
just "sports cars". Care must be taken not to overly narrow 
a search query by adding in too many terms associated with 
dynamic categories. For example, the search query "auto- 
mobiles AND compact cars AND full size cars AND sports 
cars" may not yield any search results. Each category may 
optionally have keywords associated with it which can be 
used in narrowing the search (used as AND or OR terms).
The keywords can be statically defined in a dictionary, or
may be dynamically generated by looking for the most
common words in items in each category. It may be
advantageous to use AND terms more sparingly than OR terms
since they may overly limit the search.
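
The two ideas in this section can be sketched as follows: modifying a query with a term taken from a dynamic category, and generating category keywords from the most common words in the category's items. This is an illustrative sketch only; the function names, the stop-word list and the sample item text are assumptions, and the query syntax is the plain "AND"/"OR" form used in the examples above.

```python
import re
from collections import Counter

# Assumed stop-word list; a real system would likely use a larger one.
STOP_WORDS = frozenset({"the", "a", "and", "of", "for"})


def add_term(query, term, operator="AND"):
    """Narrow (AND) or broaden (OR) a query with a category term."""
    return f"{query} {operator} {term}"


def replace_query(term):
    """Use the category term instead of the original query."""
    return term


def category_keywords(item_texts, count=2):
    """Return the most common non-stop words across a category's items."""
    words = Counter()
    for text in item_texts:
        words.update(
            word
            for word in re.findall(r"[a-z]+", text.lower())
            if word not in STOP_WORDS
        )
    return [word for word, _ in words.most_common(count)]


narrowed = add_term("automobile", "sports cars")
replaced = replace_query("sports cars")
keywords = category_keywords(
    ["Fast sports cars for sale", "Sports cars reviewed", "Classic sports coupe"]
)
broadened = add_term("automobile", keywords[0], operator="OR")
print(narrowed, "|", replaced, "|", keywords, "|", broadened)
```

Adding keywords with OR widens the result set, while each AND term can only shrink it, which is why AND terms are the ones to use sparingly.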

The invention is not limited in its application to any
particular type of search results. Rather, dynamic
categorizations may be used with any type of search results.
Further, although dynamic categorization has been described
herein primarily in the context of categorizing search results
from a new search, dynamic categorization may also be used
with portions of search results. For example, dynamic
categorization may be applied to a locally cached portion of
search results and optionally extended to the remaining
portions of the search results, i.e., the portions of the search
results that are remotely stored. In addition, the approach
described herein may be applied to locally cached search
results that are periodically updated by background search
processes. Thus, the approach described herein may be
applied to any portion of search results.

Embodiments of the invention are also applicable to
real-time search applications in which additional matching
data items are received after a query has been processed,
matching data items have been received, and categories
have already been dynamically determined as described
herein. In this circumstance, the additional matching data
items are examined and added to the existing categories if
possible. For example, additional matching data items that
have attributes that are sufficiently similar to attributes of
the existing categories can be added to those categories. The
additional matching data items that cannot be assigned to
existing categories may be retained as part of the search
results and included in the next dynamic categorization. As
a result, when a user elects to re-categorize, all of the
additional matching data items may be assigned to categories.
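
One way to picture the incremental step is sketched below. It assumes each existing category is summarized by a set of attributes and that "sufficiently similar" means a Jaccard coefficient at or above a threshold; the threshold value, the data layout and the function names are assumptions for illustration, not details given in this specification.

```python
SIMILARITY_THRESHOLD = 0.5  # assumed cut-off for "sufficiently similar"


def jaccard(a, b):
    """Jaccard coefficient: size of intersection over size of union."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def assign_incremental(categories, category_attrs, new_items,
                       threshold=SIMILARITY_THRESHOLD):
    """Add newly received items to existing categories where possible.

    categories:     {category name: [item ids]}, updated in place
    category_attrs: {category name: set of attributes}
    new_items:      {item id: set of attributes}
    Returns the ids that fit no category; they are kept for the
    next full dynamic categorization.
    """
    pending = []
    for item_id, attrs in new_items.items():
        best_name, best_score = None, 0.0
        for name, cat_attrs in category_attrs.items():
            score = jaccard(attrs, cat_attrs)
            if score > best_score:
                best_name, best_score = name, score
        if best_name is not None and best_score >= threshold:
            categories[best_name].append(item_id)
        else:
            pending.append(item_id)
    return pending


categories = {"sports cars": ["d1"], "compact cars": ["d2"]}
category_attrs = {
    "sports cars": {"car", "sports", "coupe"},
    "compact cars": {"car", "compact", "hatchback"},
}
new_items = {"d3": {"car", "sports"}, "d4": {"truck", "diesel"}}
pending = assign_incremental(categories, category_attrs, new_items)
print(categories, pending)
```

Items left in `pending` remain part of the search results and would be picked up when the user elects to re-categorize.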

7. IMPLEMENTATION MECHANISMS

The approach for presenting search results using dynamic
categorization as described herein may be implemented in
discrete hardware circuitry, in computer software, or a
combination of discrete hardware circuitry and computer
software.

FIG. 4 is a block diagram that illustrates a computer
system 400 upon which embodiments of the invention may
be implemented. Computer system 400 includes a bus 402
or other communication mechanism for communicating
information, and a processor 404 coupled with bus 402 for
processing information. Computer system 400 also includes
a main memory 406, such as a random access memory
(RAM) or other dynamic storage device, coupled to bus 402
for storing information and instructions to be executed by
processor 404. Main memory 406 also may be used for
storing temporary variables or other intermediate
information during execution of instructions to be executed by
processor 404. Computer system 400 further includes a read
only memory (ROM) 408 or other static storage device
coupled to bus 402 for storing static information and
instructions for processor 404. A storage device 410, such as a
magnetic disk or optical disk, is provided and coupled to bus
402 for storing information and instructions.

Computer system 400 may be coupled via bus 402 to a
display 412, such as a cathode ray tube (CRT), for displaying
information to a computer user. An input device 414,
including alphanumeric and other keys, is coupled to bus 402 for
communicating information and command selections to
processor 404. Another type of user input device is cursor
control 416, such as a mouse, a trackball, or cursor direction
keys for communicating direction information and command
selections to processor 404 and for controlling cursor
movement on display 412. This input device typically has
two degrees of freedom in two axes, a first axis (e.g., x) and
a second axis (e.g., y), that allows the device to specify
positions in a plane.

The invention is related to the use of computer system 400
for presenting search results using dynamic categorization.
According to one embodiment of the invention, the
presentation of search results using dynamic categorization is
provided by computer system 400 in response to processor
404 executing one or more sequences of one or more
instructions contained in main memory 406. Such
instructions may be read into main memory 406 from another
computer-readable medium, such as storage device 410.
Execution of the sequences of instructions contained in main
memory 406 causes processor 404 to perform the process
steps described herein. One or more processors in a
multi-processing arrangement may also be employed to execute
the sequences of instructions contained in main memory
406. In alternative embodiments, hard-wired circuitry may
be used in place of or in combination with software
instructions to implement the invention. Thus, embodiments of the
invention are not limited to any specific combination of
hardware circuitry and software.

The term "computer-readable medium" as used herein
refers to any medium that participates in providing
instructions to processor 404 for execution. Such a medium may
take many forms, including but not limited to, non-volatile
media, volatile media, and transmission media. Non-volatile
media includes, for example, optical or magnetic disks, such
as storage device 410. Volatile media includes dynamic
memory, such as main memory 406. Transmission media
includes coaxial cables, copper wire and fiber optics,
including the wires that comprise bus 402. Transmission media can
also take the form of acoustic or light waves, such as those
generated during radio wave and infrared data
communications.

Common forms of computer-readable media include, for
example, a floppy disk, a flexible disk, hard disk, magnetic
tape, or any other magnetic medium, a CD-ROM, any other
optical medium, punch cards, paper tape, any other physical
medium with patterns of holes, a RAM, a PROM, and
EPROM, a FLASH-EPROM, any other memory chip or
cartridge, a carrier wave as described hereinafter, or any
other medium from which a computer can read.

Various forms of computer readable media may be
involved in carrying one or more sequences of one or more
instructions to processor 404 for execution. For example, the
instructions may initially be carried on a magnetic disk of a
remote computer. The remote computer can load the
instructions into its dynamic memory and send the instructions over
a telephone line using a modem. A modem local to computer
system 400 can receive the data on the telephone line and
use an infrared transmitter to convert the data to an infrared
signal. An infrared detector coupled to bus 402 can receive
the data carried in the infrared signal and place the data on
bus 402. Bus 402 carries the data to main memory 406, from
which processor 404 retrieves and executes the instructions.
The instructions received by main memory 406 may
optionally be stored on storage device 410 either before or after
execution by processor 404.

Computer system 400 also includes a communication
interface 418 coupled to bus 402. Communication interface
418 provides a two-way data communication coupling to a
network link 420 that is connected to a local network 422.
For example, communication interface 418 may be an
integrated services digital network (ISDN) card or a modem
to provide a data communication connection to a
corresponding type of telephone line. As another example,
communication interface 418 may be a local area network
(LAN) card to provide a data communication connection to
a compatible LAN. Wireless links may also be implemented.
In any such implementation, communication interface 418
sends and receives electrical, electromagnetic or optical
signals that carry digital data streams representing various
types of information.

Network link 420 typically provides data communication
through one or more networks to other data devices. For
example, network link 420 may provide a connection
through local network 422 to a host computer 424 or to data
equipment operated by an Internet Service Provider (ISP)
426. ISP 426 in turn provides data communication services
through the world wide packet data communication network
now commonly referred to as the "Internet" 428. Local
network 422 and Internet 428 both use electrical,
electromagnetic or optical signals that carry digital data streams.
The signals through the various networks and the signals on
network link 420 and through communication interface 418,
which carry the digital data to and from computer system
400, are exemplary forms of carrier waves transporting the
information.

Computer system 400 can send messages and receive
data, including program code, through the network(s),
network link 420 and communication interface 418. In the
Internet example, a server 430 might transmit a requested
code for an application program through Internet 428, ISP
426, local network 422 and communication interface 418. In
accordance with the invention, one such downloaded
application provides for presenting search results using dynamic
categorization as described herein.

The received code may be executed by processor 404 as
it is received, and/or stored in storage device 410, or other
non-volatile storage for later execution. In this manner,
computer system 400 may obtain application code in the
form of a carrier wave.

The approach for presenting search results using dynamic
categorization as described herein provides several
advantages over prior approaches for presenting search results.
First, a large number of search results can be presented to a
user in an organized manner without the loss of information.
This eliminates the need to reduce the amount of search
results by narrowing search criteria. In addition, since
dynamically-determined categories are based upon the
attributes of particular search results, the dynamically
determined categories are customized to each set of search
results. In particular, this allows unique sets of
sub-categories to be generated for each set of search results.
Furthermore, the approach for presenting search results
using dynamic categorization as described herein may be
implemented using any type of clustering technique. Finally,
dynamically-determined categories can be used to modify
search criteria to aid in subsequent searches.

In the foregoing specification, the invention has been
described with reference to specific embodiments thereof. It
will, however, be evident that various modifications and
changes may be made thereto without departing from the
broader spirit and scope of the invention. The specification
and drawings are, accordingly, to be regarded in an
illustrative rather than a restrictive sense.

What is claimed is:

1. A method for presenting search results, the method
comprising the steps of:

receiving search results;
dynamically establishing one or more search result
categories based upon attributes of the search results by
identifying common attributes among the search
results,
generating a set of one or more coefficients that reflect
the similarity or dissimilarity of the search results
based upon the common attributes,
grouping the search results based upon the set of one or
more coefficients, and
selecting the one or more categories based upon the
grouping of the search results; and

presenting one or more category identifiers corresponding
to the one or more search result categories.

2. The method as recited in claim 1, wherein every
member of the one or more search result categories is a data
item that satisfies criteria specified in a query that produced
the search results.

3. The method as recited in claim 1, wherein the step of
identifying common attributes among the search results is
performed using Bayesian clustering techniques.

4. The method as recited in claim 1, wherein the step of
identifying common attributes among the search results is
performed using a neural network.

5. The method as recited in claim 1, wherein

the coefficients are Jaccard coefficients, and

the step of generating a set of one or more coefficients that
reflect the similarity of the search results based upon
the common attributes includes the step of generating a
set of one or more Jaccard coefficients that reflect the
similarity of the search results based upon the common
attributes.

6. The method as recited in claim 1, wherein

the search results are first search results,

the method further comprises the step of applying
relevance criteria to the first search results to generate
second search results that satisfy the relevance criteria,
and

the step of dynamically establishing one or more search
result categories based upon attributes of the search
results includes the step of dynamically establishing
one or more search result categories based upon
attributes of the second search results.

7. The method as recited in claim 1, wherein

the method further comprises the step of sorting the
search results by the attributes of the search results to
generate sorted search results, and

the step of dynamically establishing one or more search
result categories based upon attributes of the search
results includes the step of dynamically establishing
one or more search result categories based upon
attributes of the sorted search results.

8. The method as recited in claim 1, wherein the search
results include a plurality of matching data items and the
method further comprises the step of assigning the matching
data items to the one or more search result categories.

9. The method as recited in claim 1, further comprising
the step of in response to a user selection, presenting search
results associated with the one or more search result
categories.

10. The method as recited in claim 1, wherein the method
further comprises the steps of

dynamically establishing one or more search result
sub-categories based upon both the one of said search result
categories and the search results that belong to said one
of said search result categories, and

presenting one or more sub-category identifiers
corresponding to the one or more search result
sub-categories.

11. The method as recited in claim 10, further comprising
the step of in response to a user selection, presenting search
results associated with the one or more sub-categories.

12. A method for presenting search results comprising the
steps of:

receiving search results;

dynamically establishing one or more search result
categories based upon attributes of the search results;

presenting one or more category identifiers corresponding
to the one or more search result categories; and

presenting one or more static category identifiers
corresponding to one or more static search result categories.

13. The method as recited in claim 12, further comprising
the steps of:

presenting first search results corresponding to the one or
more search result categories, and

presenting second search results corresponding to the one
or more static search result categories.

14. A method for presenting search results comprising the
steps of:

in response to a user selection of one or more of the one
or more candidate category identifiers, establishing one
or more final search result categories based upon both
the one or more candidate search result categories and
the user selection; and

presenting one or more final category identifiers
corresponding to the one or more final search result
categories.

15. A method for presenting search results on a user
interface, the method comprising the steps of:

displaying on the user interface one or more user interface
objects corresponding to the one or more search result
categories that have been dynamically established
based upon attributes of the search results; and

displaying on the user interface one or more user interface
objects corresponding to one or more static categories.

16. The method as recited in claim 15, further comprising
the step of responding to a user selection of a particular user
interface object from the one or more user interface objects
by displaying on the user interface search results associated
with a particular search result category corresponding to the
particular user interface object.

17. The method as recited in claim 15, further comprising
the step of in response to a first user selection of a first user
interface object from the one or more user interface objects,
displaying on the user interface one or more sub-category
user interface objects corresponding to one or more
sub-categories, wherein the one-or-more sub-categories are
associated with the category corresponding to the first user
interface object, the one or more sub-categories being
dynamically determined based upon the attributes of the
search results.

18. The method as recited in claim 17, further comprising
the step of in response to a second user selection of the first
user interface object, undisplaying from the user interface
the one or more sub-category user interface objects.

19. The method as recited in claim 17, further comprising
the step of in response to a second user selection of the one
or more sub-category user interface objects, displaying on
the user interface search results associated with the one or
more sub-categories corresponding to the sub-category user
interface objects.

20. The method as recited in claim 19, further comprising
the step of in response to a fourth user selection of the one
or more sub-category user interface objects, undisplaying
from the user interface the search results associated with the
one or more sub-categories corresponding to the
sub-category user interface objects.

21. A computer system for presenting search results to a
user, the computer system comprising: 

a user interface; 

one or more processors; and 

a memory communicatively coupled to the one or more
processors and containing one or more sequences of
one or more instructions which, when executed by the
one or more processors, cause the computer system to 
perform the steps of 
receiving search results, 

dynamically establishing one or more search result cat- 
egories based upon attributes of the search results by 
identifying common attributes among the search 
results, 

generating a set of one or more coefficients that reflect 
the similarity or dissimilarity of the search results 
based upon the common attributes, 

grouping the search results based upon the set of one or 
more coefficients, and 

selecting the one or more categories based upon the 
grouping of the search results; and 

displaying on the user interface the one or more cat- 
egory indicators corresponding to the one or more 
search result categories. 

22. The computer system as recited in claim 21, wherein 
every member of the one or more search result categories is 
a data item that satisfies criteria specified in a query that 
produced the search results. 

23. The computer system as recited in claim 21, wherein 
the step of identifying common attributes among the search 
results is performed using Bayesian clustering techniques. 

24. The computer system as recited in claim 21, wherein 
the step of identifying common attributes among the search 
results is performed using a neural network. 

25. The computer system as recited in claim 21, wherein 
the coefficients are Jaccard coefficients, and 

the step of generating a set of one or more coefficients that 
reflect the similarity of the search results based upon 
the common attributes includes the step of 
generating a set of one or more Jaccard coefficients that 

reflect the similarity of the search results based upon 

the common attributes. 

26. The computer system as recited in claim 21, wherein 
the search results are first search results, 

the memory system further comprises instructions for 
performing the step of applying relevance criteria to the 
first search results to generate second search results that 
satisfy the relevance criteria, and 

the step of dynamically establishing one or more search 
result categories based upon attributes of the search 
results includes the step of dynamically establishing 
one or more search result categories based upon 
attributes of the second search results. 

27. The computer system as recited in claim 21, wherein 
the memory further includes instructions for performing the

step of sorting the search results by the attributes of the 
search results to generate sorted search results, and 
the step of dynamically establishing one or more search 
result categories based upon attributes of the search 
results includes the step of dynamically establishing 
one or more search result categories based upon 
attributes of the sorted search results. 
28. The computer system as recited in claim 21, wherein
the search results include a plurality of matching data items
and the method further comprises the step of assigning the
matching data items to the one or more search result 
categories. 

29. The computer system as recited in claim 21, wherein 
the memory further includes instructions for performing the 
step of in response to a user selection, presenting search 
results associated with the one or more search result cat- 
egories. 

30. The computer system as recited in claim 21, wherein 
the memory further includes instructions for performing the 
steps of 

dynamically establishing one or more search result sub- 
categories based upon both the one of said search result 
categories and the search results that belong to said one 
of said search result categories, and 

presenting one or more sub-category identifiers corre- 
sponding to the one or more search result sub- 
categories. 

31. The computer system as recited in claim 30, wherein 
the memory further includes instructions for performing the 
step of in response to a user selection, presenting search 
results associated with the one or more sub-categories. 

32. A computer system for presenting search results 
comprising: 

one or more processors; and 

a memory communicatively coupled to the one or more 
processors and containing one or more sequences of 
one or more instructions which, when executed by the 
one or more processors, cause the one or more proces- 
sors to perform the steps of: 
receiving search results; 

dynamically establishing one or more search result 
categories based upon attributes of the search results; 

presenting one or more category identifiers correspond- 
ing to the one or more search result categories; and 

presenting one or more static category identifiers cor- 
responding to one or more static search result cat- 
egories. 

33. The computer system as recited in claim 32, wherein 
the memory further includes one or more additional instruc- 
tions which, when processed by the one or more processors, 
cause the one or more processors to perform the steps of 

presenting first search results corresponding to the one or 

more search result categories, and 
presenting second search results corresponding to the one 

or more static search result categories. 

34. A computer system for presenting search results 
comprising: 

one or more processors; and 

a memory communicatively coupled to the one or more 
processors and containing one or more sequences of 
one or more instructions which, when executed by the 
one or more processors, cause the one or more processors
to perform the steps of: 
receiving search results; 

dynamically establishing one or more candidate search 
result categories based upon attributes of the search 
results; 

presenting one or more candidate category identifiers 
corresponding to the one or more candidate search 
result categories; 

in response to a user selection of one or more of the one 
or more candidate category identifiers, establishing 
one or more final search result categories based upon 
both the one or more candidate search result
categories and the user selection; and

presenting one or more final category identifiers
corresponding to the one or more final search result
categories.

35. A computer-readable medium carrying one or more
sequences of one or more instructions for presenting search
results to a user, the one or more sequences of one or more
instructions including instructions which, when executed by
one or more processors, cause the one or more processors to
perform the steps of:

receiving search results,

dynamically establishing one or more search result
categories based upon attributes of the search results by
identifying common attributes among the search
results,
generating a set of one or more coefficients that reflect
the similarity or dissimilarity of the search results
based upon the common attributes,
grouping the search results based upon the set of one or
more coefficients, and
selecting the one or more categories based upon the
grouping of the search results; and

displaying on the user interface one or more category
indicators corresponding to the one or more search
result categories.

36. The computer-readable medium as recited in claim 35,
wherein every member of the one or more search result
categories is a data item that satisfies criteria specified in a
query that produced the search results.

37. The computer-readable medium as recited in claim 35,
wherein the step of identifying common attributes among
the search results is performed using Bayesian clustering
techniques.

38. The computer-readable medium as recited in claim 35,
wherein the step of identifying common attributes among
the search results is performed using a neural network.

39. The computer-readable medium as recited in claim 35,
wherein

the coefficients are Jaccard coefficients, and

the step of generating a set of one or more coefficients that
reflect the similarity of the search results based upon
the common attributes includes the step of
generating a set of one or more Jaccard coefficients that
reflect the similarity of the search results based upon
the common attributes.

40. The computer-readable medium as recited in claim 35,
wherein

the search results are first search results,

the computer-readable medium further includes
instructions for performing the step of applying relevance
criteria to the first search results to generate second
search results that satisfy the relevance criteria, and

the step of dynamically establishing one or more search
result categories based upon attributes of the search
results includes the step of dynamically establishing
one or more search result categories based upon
attributes of the second search results.

41. The computer-readable medium as recited in claim 35,
wherein

the computer-readable medium further includes
instructions for performing the step of sorting the search
results by the attributes of the search results to generate
sorted search results, and

the step of dynamically establishing one or more search
result categories based upon attributes of the search
results includes the step of dynamically establishing
one or more search result categories based upon
attributes of the sorted search results.

42. The computer-readable medium as recited in claim 35,
wherein the search results include a plurality of matching
data items and the method further comprises the step of
assigning the matching data items to the one or more search
result categories.

43. The computer-readable medium as recited in claim 35,
wherein the computer-readable medium further includes
instructions for performing the step of in response to a user
selection, presenting search results associated with the one
or more search result categories.

44. The computer-readable medium as recited in claim 35,
further including instructions for performing the steps of

dynamically establishing one or more search result
sub-categories based upon both the one of said search result
categories and the search results that belong to said one
of said search result categories, and

presenting one or more sub-category identifiers
corresponding to the one or more search result
sub-categories.

45. The computer-readable medium as recited in claim 44,
further including instructions for performing the step of in
response to a user selection, presenting search results
associated with the one or more sub-categories.

46. A computer-readable medium for presenting search
results, the computer readable medium carrying one or more
sequences of one or more instructions which, when
processed by one or more processors, cause the one or more
processors to perform the steps of:

receiving search results;

dynamically establishing one or more search result
categories based upon attributes of the search results;

presenting one or more category identifiers corresponding
to the one or more search result categories; and

presenting one or more static category identifiers
corresponding to one or more static search result categories.

47. The computer-readable medium as recited in claim 46,
further including instructions for performing the steps of

presenting first search results corresponding to the one or
more search result categories, and

presenting second search results corresponding to the one
or more static search result categories.

48. A computer-readable medium for presenting search
results, the computer readable medium carrying one or more
sequences of one or more instructions which, when
processed by one or more processors, cause the one or more
processors to perform the steps of:

receiving search results;

dynamically establishing one or more candidate search
result categories based upon attributes of the search
results;

presenting one or more candidate category identifiers
corresponding to the one or more search result
categories; and

in response to a user selection of one or more of the one
or more candidate category identifiers, establishing one
or more final search result categories based upon both
the one or more candidate search result categories and
the user selection; and

presenting one or more final category identifiers
corresponding to the one or more final search result
categories.

* * * * *